Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

integration for web-poet's support on additional requests and Meta #62

Merged
merged 77 commits into from
Jun 16, 2022

Conversation

BurnzZ
Copy link
Contributor

@BurnzZ BurnzZ commented Feb 8, 2022

Requires the following to be merged in web-poet:

This is built on top of

Progress

  • Testing across a wide-variety of scenarios/cases
    • contextvars and Twisted
    • scrapy-poet and scrapy integration
  • Docs
    • update doc links using intersphinx-mapping when web-poet is released.
  • update error handling after Raise exceptions when receiving 400-5xx responses web-poet#38 is merged
    • ensure that Scrapy doesn't redirect on HEAD requests (reference)
  • Tests
    • Add test to verify that additional requests work both with an asyncio and a non-asyncio Twisted reactor.
  • Changelog
  • Use dont_filter for additional requests
  • Document that spider middlewares are not followed for additional requests, while downloader middlewares are.
  • Once there is a web-poet release with the new page input classes, like HttpResponse:
    • Update minimum dependencies web-poet>=0.2.0 and scrapy>=2.6.0 in setup.py and tox.ini

Fixes #65.

@BurnzZ BurnzZ self-assigned this Feb 8, 2022
@codecov
Copy link

codecov bot commented Feb 8, 2022

Codecov Report

Merging #62 (c0eddf3) into master (53e5b92) will increase coverage by 0.29%.
The diff coverage is 100.00%.

❗ Current head c0eddf3 differs from pull request most recent head 98ce454. Consider uploading reports for the commit 98ce454 to get more accurate results

@@            Coverage Diff             @@
##           master      #62      +/-   ##
==========================================
+ Coverage   97.46%   97.76%   +0.29%     
==========================================
  Files           9       10       +1     
  Lines         395      448      +53     
==========================================
+ Hits          385      438      +53     
  Misses         10       10              
Impacted Files Coverage Δ
scrapy_poet/api.py 100.00% <100.00%> (ø)
scrapy_poet/downloader.py 100.00% <100.00%> (ø)
scrapy_poet/middleware.py 100.00% <100.00%> (ø)
scrapy_poet/page_input_providers.py 95.00% <100.00%> (+1.00%) ⬆️
scrapy_poet/utils.py 100.00% <100.00%> (ø)

tox.ini Outdated Show resolved Hide resolved
scrapy_poet/middleware.py Outdated Show resolved Hide resolved
@BurnzZ BurnzZ changed the title integration for web-poet's support on additional requests integration for web-poet's support on additional requests and Meta Feb 18, 2022
CHANGELOG.rst Show resolved Hide resolved
@BurnzZ BurnzZ marked this pull request as ready for review March 15, 2022 07:08
@BurnzZ BurnzZ mentioned this pull request Mar 16, 2022
2 tasks
@BurnzZ BurnzZ force-pushed the po-additional-requests branch from 5eca70e to 8340ced Compare March 29, 2022 03:19
tests/test_backend.py Outdated Show resolved Hide resolved
Copy link
Member

@kmike kmike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 +1 to merge it after making sure it works with the latest web-poet (scrapinghub/web-poet#42 is merged).

@Gallaecio Gallaecio closed this Jun 9, 2022
@Gallaecio Gallaecio reopened this Jun 9, 2022
@Gallaecio
Copy link
Member

Closing and reopening to trigger tests with the latest web-poet master.

As for merging, we still need to wait for a web-poet release (0.2?), then update requirements here, re-run tests, and then merge.

@Gallaecio Gallaecio marked this pull request as draft June 9, 2022 09:20
@kmike
Copy link
Member

kmike commented Jun 11, 2022

@Gallaecio could you please check scrapinghub/web-poet#51 before making scrapy-poet release with these changes?

@Gallaecio
Copy link
Member

3398dc7 is not as trivial as the commit message may suggest:

@Gallaecio Gallaecio marked this pull request as ready for review June 15, 2022 08:50
tox.ini Outdated Show resolved Hide resolved
@kmike
Copy link
Member

kmike commented Jun 15, 2022

👍, +1 to merge.

Copy link
Contributor Author

@BurnzZ BurnzZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got one minor change left and I believe we're good to go. 🚀

'scrapy >= 2.6.0',
'sqlitedict >= 1.5.0',
'url-matcher >= 0.2.0',
'web-poet >= 0.2.0',
Copy link
Contributor Author

@BurnzZ BurnzZ Jun 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per scrapinghub/web-poet#51 (comment), we should probably make this into 0.3.0, as well as updating the CHANGELOG and tox.ini.

EDIT: I just realized we're not using contextvars in scrapy-poet so we can set it to 0.2.0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I went through the same path. It was when I went to apply the contextvar name change to the patch when I realized we do not need 0.3.0 here.

@kmike kmike merged commit e4589e6 into master Jun 16, 2022
@kmike
Copy link
Member

kmike commented Jun 16, 2022

Thanks @BurnzZ and @Gallaecio, that's a big milestone!

@BurnzZ BurnzZ deleted the po-additional-requests branch October 14, 2022 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

callback_for does not produce a coroutine.
3 participants